Automated syntactic text description enhancement: determination analysis

Authors

  • Jules Duchastel
  • Louis-Claude Paquin
  • Jacques Beauchemin
Abstract

This paper seeks to make a pragmatic contribution to computer-assisted discourse analysis by showing how a syntactic description can be used to gain a deeper understanding of texts. During the 1980s, a large corpus (5,000 pages) of varied political discourse covering twenty-five years (1934-1960) was collected and described in two ways. On the one hand, to deal with the occurrence of terms referring to the same notions in the lexicon, a set of 144 sociological categories was assigned to relevant expressions. For example, the following terms were tagged as 'financial notions': bank, credit, savings, dividend, etc. On the other hand, a surface syntactic description of each sentence of the corpus was produced by means of a parser (Plante 1979) controlled by a heuristic strategy and programmed in Déredec (a LISP sub-language). The use of a syntactic description was founded on the hypothesis that it would reduce the random nature of the lexical distribution of the words in the texts: a lexicon restricted to clause subjects is more 'qualified' than a lexicon built with no selection criteria. This particular description identifies three contextual dependency relations for every clause, namely theme/rheme, determination, and verbal arguments.
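The double description in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the original parser was written in Déredec, and everything below apart from the four 'financial notions' terms is assumed for illustration. The function names (`tag_categories`, `category_counts`, `subject_lexicon`), the role labels, and the sample clause are all hypothetical stand-ins for parser output.

```python
from collections import Counter

# Hypothetical category lexicon. Only these four terms and the label
# 'financial notions' come from the abstract; the full set of 144
# sociological categories is not reproduced here.
CATEGORY_LEXICON = {
    "bank": "financial notions",
    "credit": "financial notions",
    "savings": "financial notions",
    "dividend": "financial notions",
}


def tag_categories(tokens):
    """Pair each token with its sociological category, or None if untagged."""
    return [(tok, CATEGORY_LEXICON.get(tok.lower())) for tok in tokens]


def category_counts(tagged):
    """Count how often each category occurs among the tagged tokens."""
    return Counter(cat for _, cat in tagged if cat is not None)


def subject_lexicon(clauses):
    """Build a lexicon restricted to words in clause-subject position.

    Each clause is a list of (word, role) pairs; the role labels are
    invented here and merely stand in for a parser's syntactic output.
    """
    counts = Counter()
    for clause in clauses:
        for word, role in clause:
            if role == "subject":
                counts[word.lower()] += 1
    return counts


# Hypothetical clause with assumed role labels.
clause = [
    ("The", "determiner"),
    ("bank", "subject"),
    ("extends", "verb"),
    ("credit", "object"),
]
```

Filtering by syntactic role before counting is what makes the resulting lexicon more 'qualified' in the sense of the abstract: `subject_lexicon([clause])` counts only `bank`, whereas an unfiltered count would include every word of the clause.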

Similar resources

Automated syntactic text description enhancement: The thematic structure of discourse utterances

Our work aims at the optimization of existing tools for computer-assisted description and analysis of textual data. More specifically, we want to carry out the thematic description of clauses and clause complexes of Quebec Budget speeches from 1934 to 1960. Our main objective is to enhance the work already done in this direction (Bourque and Duchastel, 1988) by elaborating the analytic frame...

Morphological Analysis as a Step in Automated Syntactic Analysis of a Text

The general purpose of this study is to investigate the possibility of an almost completely automated syntactic analysis of a given text of a known language. This has in itself some theoretical linguistic interest, and to the extent that it succeeds, it will save a large amount of labour in relation to automated indexing, translation, etc. One step in this analysis is to count the occurrenc...

Procedures for the Determination of Distributional Classes

Studies in Distributional Semantics are now underway at The RAND Corporation based on the 250,000-word corpus of Russian Physics text. For present purposes, it is important to note that this text has been subjected to machine translation and human post-editing, and that a glossary of the forms found in the text and a syntactic description of each sentence are preserved on magnetic tape. The sy...

Design of a hybrid high quality machine translation system

This paper gives an overview of the ongoing FP7 project HyghTra (2010-2014). The HyghTra project is conducted in a partnership between academia and industry involving the University of Leeds and Lingenio GmbH (company). It adopts a hybrid and bootstrapping approach to the enhancement of MT quality by applying rule-based analysis and statistical evaluation techniques to both parallel and compa...

A Linguistic Analysis of Conference Titles in Applied Linguistics

Over the past twenty-five years, researchers have expressed considerable interest in titles of academic publications. Unfortunately, conference paper titles (CPTs) have only recently begun to receive attention. The aim of this study, therefore, is to investigate the text length, syntactic structure, and lexicon of CPTs in Applied Linguistics. A data set of 698 titles was selected from the 2008 ...

Publication date: 2001